data pair
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Oceania > Australia (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- (2 more...)
Generative AI-enhanced Probabilistic Multi-Fidelity Surrogate Modeling Via Transfer Learning
Zeng, Jice, Barajas-Solano, David, Chen, Hui
The performance of machine learning surrogates is critically dependent on data quality and quantity. This presents a major challenge, as high-fidelity (HF) data is often scarce and computationally expensive to acquire, while low-fidelity (LF) data is abundant but less accurate. To address this data-scarcity problem, we develop a probabilistic multi-fidelity surrogate framework based on generative transfer learning. We employ a normalizing flow (NF) generative model as the backbone, which is trained in two phases: (i) the NF is first pretrained on a large LF dataset to learn a probabilistic forward model; (ii) the pretrained model is then fine-tuned on a small HF dataset, allowing it to correct for LF-HF discrepancies via knowledge transfer. To relax the dimension-preserving constraint of standard bijective NFs, we integrate surjective (dimension-reducing) layers with standard coupling blocks. This architecture enables learned dimension reduction while preserving the ability to train with exact likelihoods. The resulting surrogate provides fast probabilistic predictions with quantified uncertainty and significantly outperforms LF-only baselines while using fewer HF evaluations. We validate the approach on a reinforced concrete slab benchmark, combining many coarse-mesh (LF) simulations with a limited set of fine-mesh (HF) simulations. The proposed model achieves probabilistic predictions with HF accuracy, demonstrating a practical path toward data-efficient, generative AI-driven surrogates for complex engineering systems. Email address: David.Barajas-Solano@pnnl.gov (David Barajas-Solano) Introduction High-fidelity (HF) computer modeling using discretization schemes such as the finite elements (FE) method provides a rigorous framework for analyzing and predicting the behavior of complex engineering systems.
- Asia > China > Hubei Province > Wuhan (0.04)
- North America > United States (0.04)
- Materials > Construction Materials (0.69)
- Energy (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.71)
Simulation-Free Training of Neural ODEs on Paired Data
In this work, we investigate a method for simulation-free training of Neural Ordinary Differential Equations (NODEs) for learning deterministic mappings between paired data. Despite the analogy of NODEs as continuous-depth residual networks, their application in typical supervised learning tasks has not been popular, mainly due to the large number of function evaluations required by ODE solvers and numerical instability in gradient estimation. To alleviate this problem, we employ the flow matching framework for simulation-free training of NODEs, which directly regresses the parameterized dynamics function to a predefined target velocity field. Contrary to generative tasks, however, we show that applying flow matching directly between paired data can often lead to an ill-defined flow that breaks the coupling of the data pairs (e.g., due to crossing trajectories). We propose a simple extension that applies flow matching in the embedding space of data pairs, where the embeddings are learned jointly with the dynamic function to ensure the validity of the flow which is also easier to learn. We demonstrate the effectiveness of our method on both regression and classification tasks, where our method outperforms existing NODEs with a significantly lower number of function evaluations.
Semi-Supervised Neural Architecture Search
Neural architecture search (NAS) relies on a good controller to generate better architectures or predict the accuracy of given architectures. However, training the controller requires both abundant and high-quality pairs of architectures and their accuracy, while it is costly to evaluate an architecture and obtain its accuracy. In this paper, we propose SemiNAS, a semi-supervised NAS approach that leverages numerous unlabeled architectures (without evaluation and thus nearly no cost). Specifically, SemiNAS 1) trains an initial accuracy predictor with a small set of architecture-accuracy data pairs; 2) uses the trained accuracy predictor to predict the accuracy of large amount of architectures (without evaluation); and 3) adds the generated data pairs to the original data to further improve the predictor. The trained accuracy predictor can be applied to various NAS algorithms by predicting the accuracy of candidate architectures for them. SemiNAS has two advantages: 1) It reduces the computational cost under the same accuracy guarantee. On NASBench-101 benchmark dataset, it achieves comparable accuracy with gradient-based method while using only 1/7 architecture-accuracy pairs.
Functional Indirection Neural Estimator for Better Out-of-distribution Generalization
The capacity to achieve out-of-distribution (OOD) generalization is a hallmark of human intelligence and yet remains out of reach for machines. This remarkable capability has been attributed to our abilities to make conceptual abstraction and analogy, and to a mechanism known as indirection, which binds two representations and uses one representation to refer to the other. Inspired by these mechanisms, we hypothesize that OOD generalization may be achieved by performing analogy-making and indirection in the functional space instead of the data space as in current methods. To realize this, we design FINE (Functional Indirection Neural Estimator), a neural framework that learns to compose functions that map data input to output on-the-fly. FINE consists of a backbone network and a trainable semantic memory of basis weight matrices.
Neural network initialization with nonlinear characteristics and information on spectral bias
Initialization of neural network parameters, such as weights and biases, has a crucial impact on learning performance; if chosen well, we can even avoid the need for additional training with backpropagation. For example, algorithms based on the ridgelet transform or the SWIM (sampling where it matters) concept have been proposed for initialization. On the other hand, it is well-known that neural networks tend to learn coarse information in the earlier layers. The feature is called spectral bias. In this work, we investigate the effects of utilizing information on the spectral bias in the initialization of neural networks. Hence, we propose a framework that adjusts the scale factors in the SWIM algorithm to capture low-frequency components in the early-stage hidden layers and to represent high-frequency components in the late-stage hidden layers. Numerical experiments on a one-dimensional regression task and the MNIST classification task demonstrate that the proposed method outperforms the conventional initialization algorithms. This work clarifies the importance of intrinsic spectral properties in learning neural networks, and the finding yields an effective parameter initialization strategy that enhances their training performance.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Oceania > Australia (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- (3 more...)